Abstract: A real-time group editor allows multiple users to edit the same text at the
same time from multiple sites across the Internet. The real-time group editor community has
developed a framework called Operational Transformation (OT) for maintaining the consistency
of shared data. OT differs from other optimistic replication systems by ensuring not only
content consistency but also intention consistency.

In this paper, we describe the WOOT (WithOut Operational Transformation) framework,
which ensures intention consistency without following the OT approach. Thanks to
its new viewpoint, WOOT is drastically simpler and more efficient, and it requires neither
vector clocks nor central sites. The WOOT framework is particularly well suited to very
large peer-to-peer networks.
Real-time editors without operational transformation

Summary: A real-time editor allows several users to modify the same document at the
same moment from several sites across the Internet. The real-time editor community has
developed a framework called operational transformation (OT) to maintain the consistency
of shared data. OT differs from other optimistic replication approaches by ensuring, in
addition to data convergence, that intentions are respected. In this article, we describe
the WOOT framework, which guarantees intention consistency without adopting the
operational transformation approach. This new approach is considerably simpler and more
efficient, and requires neither vector clocks nor central sites. The WOOT framework is
particularly well suited to large-scale peer-to-peer networks.

Keywords: real-time editors, optimistic replication, operational transformation,
consistency of distributed data
Figure 1: Intention violation problem


1 Introduction 

A real-time group editor allows multiple users to edit the same text at the same time
from multiple sites across the Internet. To achieve high responsiveness, shared data are
replicated on all sites. To achieve unconstrained interaction, there are no locking
or serialization protocols: any user can edit the text at any time. If two users generate
concurrent operations, the system has to ensure that the replicas converge to the best
possible state.

Group editors are based on the Operational Transformation (OT) model. In this model,
an operation generated on one site is executed locally immediately and then broadcast
to the other sites to be re-executed. The OT model traditionally ensures the consistency
model defined in [12]: OT algorithms ensure causality, convergence and intention preservation.
While causality and convergence are very common in optimistic replication, intention
preservation is more unusual. Intention preservation ensures that (1) executing any
operation o at a remote site achieves the same effect as executing o on the state where it
was generated, and (2) the execution effects of concurrent operations do not interfere.

The intention-violation problem was introduced in the REDUCE approach. Two
sites share a string containing "ABCDE" (cf. figure 1(a)). Site 1 inserts "12" at position 2 and
obtains "A12BCDE". Site 1 has executed operation o1 = ins(2, "12") with the intention of
inserting "12" between 'A' and 'B'. Site 2 deletes one character at position 3 and obtains "ABDE".
Site 2 has executed operation o2 = del(3) with the intention of deleting the character 'C'.

If we execute both operations and preserve intentions, we must obtain "A12BDE". But
if we use a serialization protocol, it may decide to serialize o1 before o2. The final result
in this case is "A1BCDE" (cf. figure 1(b)). Convergence is achieved but intentions are not
preserved.
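The serialization outcome above can be replayed with a few lines of Python. This is only an illustrative sketch: the helper names `ins` and `delete` are ours, and positions are assumed to be 1-based, as in the example.

```python
def ins(text, pos, s):
    """Insert s so that it starts at 1-based position pos."""
    return text[:pos - 1] + s + text[pos - 1:]


def delete(text, pos):
    """Delete the character at 1-based position pos."""
    return text[:pos - 1] + text[pos:]


initial = "ABCDE"
site1 = ins(initial, 2, "12")   # local execution of o1 on site 1
site2 = delete(initial, 3)      # local execution of o2 on site 2

# A serialization protocol replays o1 then o2 without adapting o2:
# position 3 no longer designates 'C' but the inserted '2'.
serialized = delete(ins(initial, 2, "12"), 3)

print(site1, site2, serialized)
```

The replayed del(3) removes '2' instead of 'C', which is exactly the intention violation described above.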

Recently, Li et al. introduced the notion of operation effects relation preservation.
For example, if a user inserts a character 'b' between 'a' and 'c', we must ensure that, in
the convergence state, 'b' is still between 'a' and 'c'. If there exists an order relation
between the effects of two operations on a state, then this order relation must be preserved
when both operations are re-executed on any state.

Currently, only the SDT algorithm ensures intention consistency as defined by Li et al. SDT
does not require a central site. However, SDT is a complex algorithm and requires vector
clocks. Vector clocks do not scale because their size is proportional to the number of sites.
Relying on vector clocks is a major drawback for deploying real-time groupware on large
peer-to-peer networks.

In this paper, we present a new framework called WOOT (WithOut Operational
Transformation) that ensures intention consistency as defined by Li, but without operational
transformations, without vector clocks and without central sites. WOOT is particularly well
suited to very large peer-to-peer networks, drastically simpler than SDT and easy to implement.


2 WOOT approach 

Traditional optimistic replication approaches ensure convergence of replicas [7]. The OT
approach ensures more than convergence and defines the notion of intention consistency.
Intention consistency means that the effects of an operation on its generation state are
preserved when the operation is re-executed on remote sites. But what is the effect of a
simple insert operation on a string? Sun describes it when presenting the intention violation
problem. When a user observes the string "ABCDE" and inserts "12" at position 2, he
inserts "12" between 'A' and 'B'. On this example, intention consistency means that if "12"
has been inserted between 'A' and 'B', the ordering 'A' ≺ "12" ≺ 'B' must be preserved in
any further state. Currently, only the SDT algorithm can ensure that these orderings are
preserved. If we analyze SDT, we can see that its main problem is to determine these
ordering relations just by analyzing the log of operations.

The WOOT approach is fairly simple. Instead of re-computing orderings at reception,
we send them with the operation, because this information is known when the operation
is generated.

The immediate effect is straightforward. Instead of executing and broadcasting insert(2, "12")
as in OT algorithms, we execute insert(2, "12") and broadcast insert('A' ≺ "12" ≺ 'B').

The first problem is what to do if we receive insert('A' ≺ "12" ≺ 'B') and 'B' has been
locally deleted. The WOOT answer is drastically simple: 'B' still exists, because we never
delete a character; we only mark it as invisible.

Of course, never deleting characters requires more memory and generates bigger
files. But we show in this paper that, with the WOOT approach, we do not need to
keep a log of operations with vector clocks. In the end, the space complexity of WOOT is
lower than that of comparable OT algorithms.

Each insert operation generates two new order relationships, but these do not form a
total order, only a partial order. For example, consider three sites, each generating
one operation, as presented in figure 2. We represent the order relationships between
characters in the Hasse diagram of figure 3.

Figure 3: Hasse diagram of order relation between characters


Acceptable states are all the linear extensions of this partial order, i.e. a312b, a321b and
a231b. To achieve convergence, all sites must find the same linear extension. Traditionally,
a topological sort is used to find a linear extension of a partially ordered set. However, we
cannot use it in our context. Each time an operation is received, a linearization must be
computed and a new state can be observed by the user. A new linearization on a site must be
compatible with all previous ones: if a linearization determined that 'a' is before 'b', a
later linearization cannot place 'b' before 'a'.

For example, we apply a topological sort to the graph in which '1' and '2' have each been
inserted between 'a' and 'b'.

First, the only node without a predecessor is 'a'. We push it on the result stack and remove
it with its edges from the graph. The next nodes without predecessors are '1' and '2'. We
need a choice rule, and we take the least. So we push '1' on the result stack and remove
it with its edges from the graph. We repeat these operations until the graph is empty. The
result of this topological sort is "a12b".

The operation insert(a ≺ 3 ≺ 1) is then received on the same site and we re-execute the
topological sort on the graph of figure 3. We obtain "a231b", which is incompatible with the
previous linearization: 2 is now before 1.
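The non-monotonic behaviour of the topological sort can be reproduced with the following sketch. It assumes Kahn's algorithm with a "take the least ready node" rule, which is our reading of the choice rule described above; the function name and the edge lists are ours.

```python
import heapq


def topo_sort_least_first(edges):
    """Kahn's algorithm; among nodes without predecessor, take the least."""
    nodes = {n for edge in edges for n in edge}
    succ = {n: [] for n in nodes}
    indeg = {n: 0 for n in nodes}
    for before, after in edges:
        succ[before].append(after)
        indeg[after] += 1
    ready = [n for n in nodes if indeg[n] == 0]
    heapq.heapify(ready)
    out = []
    while ready:
        n = heapq.heappop(ready)
        out.append(n)
        for m in succ[n]:
            indeg[m] -= 1
            if indeg[m] == 0:
                heapq.heappush(ready, m)
    return "".join(out)


# '1' and '2' inserted between 'a' and 'b'.
first = [("a", "1"), ("1", "b"), ("a", "2"), ("2", "b")]
# Then insert(a < 3 < 1) is received.
second = first + [("a", "3"), ("3", "1")]

before = topo_sort_least_first(first)
after = topo_sort_least_first(second)
print(before, after)
```

In the first result '1' precedes '2', in the second '2' precedes '1': a user who has already seen the first state would observe two existing characters swapping places.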
The challenge of the WOOT framework is to ensure convergence with a monotonic
linearization function. In this paper we focus on sequences of characters, but the
WOOT framework supports any linear structure. A linear structure can be complex; for
example, an ordered tree is a linear structure.


3 WOOT Framework 

In this section, we present the whole WOOT model and its algorithms. We formally define
the data structure used by WOOT and the order relations taken into account to linearize the
characters. We then explain the algorithms used by the WOOT framework. Finally, the
framework is evaluated in terms of correctness and complexity.

3.1 Definition of the PCI Consistency model 

A group editor is consistent iff:

Precondition Preservation Operations are integrated if their preconditions are true.

Convergence When the same set of operations has been executed at all sites, all copies of
the shared document are identical.

Intention Preservation For any operation O, the effects of executing O at all sites are
the same as the effects of executing O on its generation state.

Compared to the original consistency model, we do not require causality. An operation
can be integrated as soon as its preconditions are true. For example, a user executes locally
o1 = ins(a ≺ '1' ≺ b) and broadcasts it, then executes o2 = ins(c ≺ '2' ≺ d) and broadcasts it.
Another site can receive o2 before o1. Clearly, if causality is required, o2 must be executed
after o1. We allow o2 to be executed immediately iff its preconditions are verified, i.e. if
'c' and 'd' exist locally.

On the one hand, this allows higher concurrency, which matters for the high-responsiveness
requirement; on the other hand, if a hidden dependency exists between o1 and o2, it will be
violated. We have chosen higher concurrency, considering this acceptable as long as intentions
are preserved. This choice is not essential: it is possible to use WOOT with causal
reception. Although causal reception often involves vector clocks, vector clocks are not
required for ensuring effect preservation and convergence.

The PCI consistency model is similar to the CCI consistency model of Sun [12], except
that causality has been replaced by preconditions. Compared to the CSM consistency
model of Li, we do not adopt multi-operation effects relation preservation. We think that
relations between operation effects are already captured by the intention preservation rule. Li
wrote that intention preservation and multi-operation effects preservation imply convergence.
We think that intention preservation and convergence also imply multi-operation effects
preservation. Finally, we are convinced that the CSM, CCI and PCI consistency models are
designed to converge to the same final state. We differ from CCI and CSM only by not
requiring the traditional causality rule.

3.2 Data model 

Each site s has a unique identifier numSite_s, a logical clock H_s, a sequence string_s of
W-characters and a pool of pending operations pool_s.

Definition 1 A W-character c is a five-tuple < id, v, α, id_cp, id_cn > where

• id is the identifier of the character.

• v ∈ {True, False} indicates whether the character is visible.

• α is the alphabetical value of the character.

• id_cp is the identifier of the previous W-character of c.

• id_cn is the identifier of the next W-character of c.

The previous and next W-characters of c are the W-characters between which c has been
generated.

Definition 2 The previous W-character of a W-character c is denoted C_P(c). The next
W-character of a W-character c is denoted C_N(c).

To identify characters uniquely, we use an identifier based on a site number
and a local clock.

Definition 3 A character identifier is a pair (ns, ng) where ns is the identifier of a site
and ng is a natural number. When a character is generated on a site s, its identifier is set
to (numSite_s, H_s).

Each time a W-character is generated on a site s, the local clock H_s is incremented.
Since numSite_s is unique, the pair (numSite_s, H_s) forms a unique identifier for a character.
string_s is a W-string. It contains all the integrated W-characters.

Definition 4 A W-string is an ordered sequence of W-characters c_b c_1 c_2 ... c_n c_e where c_b
and c_e are special W-characters (with special identifiers) marking the beginning and the
ending of the sequence.

We define the following functions for a sequence S:

• |S| denotes the length of the sequence S.

• S[p] denotes the element at the position p in S. We state that the first element of a
sequence S is at position 0 and the last element is at position |S| - 1.

• pos(S, c) returns the position of the element c in S as a natural number.

• insert(S, c, p) inserts the element c in S at position p.

• subseq(S, c, d) returns the part of the sequence S between the elements c and d, both
not included.

• contains(S, c) returns true if c can be found in S.

We also need the following functions to link the W-string and the string the user sees.

• value(S) is the representation of S (i.e. the sequence of visible alphabetical values).

• ithVisible(S, i) is the i-th visible character of S.

Two operations update a W-string:

ins(c) inserts the W-character c between its previous and next characters. Its precondition
is that the previous and next characters of c exist.

del(c) deletes the W-character c. The precondition of del(c) is that c exists.
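A minimal Python sketch of this data model may help to fix ideas. The class and function names are ours, the identifiers chosen for the begin and end markers are hypothetical, and `ith_visible` is assumed to count visible characters from 1; none of these choices is prescribed by the definitions above.

```python
from dataclasses import dataclass
from typing import List, Tuple

CharId = Tuple[int, int]  # (site number, local clock value), see Definition 3


@dataclass
class WChar:
    id: CharId        # unique identifier
    visible: bool     # False once the character has been deleted
    alpha: str        # alphabetical value
    prev_id: CharId   # identifier of the previous W-character at generation
    next_id: CharId   # identifier of the next W-character at generation


def value(s: List[WChar]) -> str:
    """Visible representation of the W-string."""
    return "".join(c.alpha for c in s if c.visible)


def ith_visible(s: List[WChar], i: int) -> WChar:
    """i-th visible character, counting from 1 (assumed convention)."""
    return [c for c in s if c.visible][i - 1]


def pos(s: List[WChar], c: WChar) -> int:
    """Position of c in s, starting at 0."""
    return next(k for k, x in enumerate(s) if x.id == c.id)


def subseq(s: List[WChar], c: WChar, d: WChar) -> List[WChar]:
    """Characters strictly between c and d."""
    return s[pos(s, c) + 1:pos(s, d)]


def contains(s: List[WChar], c: WChar) -> bool:
    return any(x.id == c.id for x in s)


# Hypothetical identifiers for the begin and end markers.
CB_ID, CE_ID = (0, 0), (0, 1)
cb = WChar(CB_ID, False, "", CB_ID, CE_ID)
ce = WChar(CE_ID, False, "", CB_ID, CE_ID)
a = WChar((1, 1), True, "a", CB_ID, CE_ID)
b = WChar((1, 2), False, "b", (1, 1), CE_ID)   # deleted: kept but invisible
c = WChar((1, 3), True, "c", (1, 2), CE_ID)
w_string = [cb, a, b, c, ce]

print(value(w_string))                            # visible text only
print(ith_visible(w_string, 2).alpha)             # second visible character
print([x.alpha for x in subseq(w_string, a, c)])  # the invisible 'b' is still there
```

The deleted character 'b' no longer appears in `value`, yet it remains in the W-string, so an operation that names it as a neighbour can still be integrated.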

3.3 Orders 

Definition 5 Let a and b be two W-characters. a ≺ b if and only if there exists a sequence of
characters c_0, c_1, ..., c_i such that a = c_0, b = c_i and C_N(c_j) = c_{j+1} or c_j = C_P(c_{j+1})
for all 0 ≤ j < i.

≺ is a binary relation over the set of W-characters. It is irreflexive, transitive and
asymmetric, hence ≺ is a strict partial order.

To obtain a string from this partial order, we have to find a linear extension (i.e. a total
order).

Definition 6 Let S be a sequence. The relation <_S is defined by: a <_S b if and only if
pos(S, a) < pos(S, b).

When no precedence relation can be established between two characters, we have to
order them. To ensure convergence, this order must be set independently of the state of
the site. We use the character identifiers.

Definition 7 Let a and b be two W-characters with respective identifiers (ns_a, ng_a) and
(ns_b, ng_b). a <_id b if and only if (1) ns_a < ns_b or (2) ns_a = ns_b and ng_a < ng_b.
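Definition 7 is a lexicographic order on (ns, ng) pairs. As a small illustration (the function name `less_id` and the sample identifiers are ours), it can be written directly, and it coincides with Python's built-in tuple comparison:

```python
def less_id(a, b):
    """a <_id b for identifiers given as (ns, ng) pairs (Definition 7)."""
    (ns_a, ng_a), (ns_b, ng_b) = a, b
    return ns_a < ns_b or (ns_a == ns_b and ng_a < ng_b)


ids = [(2, 3), (1, 5), (2, 1), (1, 2)]

# The order depends only on the identifiers, never on the state of a site,
# so every site sorts concurrent characters the same way.
print(sorted(ids))
print(all(less_id(x, y) == (x < y) for x in ids for y in ids))
```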

3.4 Algorithms 

When a site generates an operation, the operation is integrated locally, broadcast and
then integrated by all other sites. The operation is integrated locally first in order to take
into account possible invisible characters, i.e. previously deleted characters.

Generation. For an operation op, type(op) denotes the type of the operation: del or ins.
char(op) denotes the character manipulated by the operation.

When a user interacts with the framework, he only sees value(S). So, when an insert
operation is generated, the user interface only knows the visible position and the alphabetical
value of the character to insert. For instance, ins(2, a) in "xyz" is translated into
ins(y ≺ a ≺ z).


GenerateIns(pos, α)
    H_s := H_s + 1
    let c_p := ithVisible(string_s, pos),
        c_n := ithVisible(string_s, pos + 1),
        wchar := < (numSite_s, H_s), True, α, c_p.id, c_n.id >
    IntegrateIns(wchar, c_p, c_n)
    broadcast ins(wchar)


Similarly, when a delete operation is generated we have to retrieve the W-character at 
this position. 

GenerateDel(pos)
    let wchar := ithVisible(string_s, pos)
    IntegrateDel(wchar)
    broadcast del(wchar)

Reception Sites may receive operations with unverified preconditions. The isExecutable
function checks the preconditions of an operation.

isExecutable(op)
    let c := char(op)
    if type(op) = del then return contains(string_s, c)
    else return contains(string_s, C_P(c)) and contains(string_s, C_N(c))
    endif

To deal with pending operations each site maintains a pool of operations. 



For instance, a site executes del(c) only if c is present. If c is not present, the integration 
of the operation is delayed until c is present. 

Main()
    loop
        find op in pool_s s.t. isExecutable(op)
        let c := char(op)
        if type(op) = del then IntegrateDel(c)
        else IntegrateIns(c, C_P(c), C_N(c))
        endif
    end loop
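The reception loop can be sketched as follows. This is a simplified illustration under our own assumptions: operations are plain tuples, the W-string is reduced to the set of identifiers it contains, and an operation is removed from the pool once integrated (the pseudocode above leaves this implicit). The names `is_executable` and `drain` are ours.

```python
def is_executable(op, known_ids):
    """Check the precondition of ("del", c) or ("ins", c, prev, next)."""
    if op[0] == "del":
        return op[1] in known_ids
    _, _cid, prev_id, next_id = op
    return prev_id in known_ids and next_id in known_ids


def drain(pool, known_ids):
    """Integrate every pending operation whose precondition holds.

    Returns the operations in the order in which they were integrated.
    """
    done = []
    progress = True
    while progress:
        progress = False
        for op in list(pool):
            if is_executable(op, known_ids):
                pool.remove(op)
                if op[0] == "ins":
                    known_ids.add(op[1])   # the new character now exists
                done.append(op)
                progress = True
    return done


known = {"cb", "ce"}             # begin and end markers
o1 = ("ins", "x", "cb", "ce")
o2 = ("ins", "y", "x", "ce")     # needs the character inserted by o1

pool = [o2]                      # o2 is received first
first_pass = drain(pool, known)  # nothing is executable yet
pool.append(o1)                  # o1 arrives later
second_pass = drain(pool, known)
print(first_pass, second_pass)
```

Here o2 stays in the pool until o1 has been integrated, without any vector clock: only the existence of the neighbouring characters is checked.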
Example 1 According to the scenario of figure 4, site 3 delays the integration of o2 until
the reception of o1.

However, site 1 can execute o3 before receiving o2, even if o2 happened before o3.

In a traditional approach that uses vector clocks, the execution is delayed in both cases.



Figure 4: Precondition preservation 


Integration To integrate an operation del(c), we only need to set the visible flag of the
character c to false, whatever its previous value.

IntegrateDel(c)
    c.v := False

To integrate an operation ins(c) in string_s, we need to place c among all the characters
between c_p and c_n. These characters can be previously deleted characters or characters
inserted by concurrent operations. When an operation ins(c) is executable on a site, the
procedure IntegrateIns(c, c_p, c_n) can be called.
IntegrateIns(c, c_p, c_n)
    let S := string_s
    let S' := subseq(S, c_p, c_n)
    if S' = ∅ then insert(S, c, pos(S, c_n))
    else
        let L := c_p d_0 d_1 ... d_m c_n where d_0 ... d_m are the
            W-chars in S' s.t. C_P(d_i) ≤_S c_p and c_n ≤_S C_N(d_i)
        let i := 1
        while (i < |L| - 1) and (L[i] <_id c) do
            i := i + 1
        IntegrateIns(c, L[i - 1], L[i])
    endif

The algorithm orders characters with <_id when no precedence relation ≺ is available.
Thus, the algorithm removes from S' the characters that have a previous or next character in
S'. Indeed, such characters are ordered by the precedence relation and not by the identifier
ordering.

Because the characters d_0 d_1 ... d_m are ordered by the relation <_id, the algorithm inserts c
at its place according to <_id. However, there may be some characters in S between L[i - 1]
and L[i]. Thus, we call the integration function recursively.
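The integration procedure can be sketched in Python as follows. This is an illustrative reading of the pseudocode, not a reference implementation: characters are represented by their identifiers only (plain integers compared with <, standing in for <_id), the markers "cb" and "ce" and the class name `WString` are our own, and deletion is modelled by a set of hidden identifiers.

```python
CB, CE = "cb", "ce"   # begin and end markers (hypothetical names)


class WString:
    def __init__(self):
        self.seq = [CB, CE]   # linearized identifiers
        self.cp = {}          # id -> previous character at generation
        self.cn = {}          # id -> next character at generation
        self.hidden = set()   # identifiers of deleted characters

    def integrate_ins(self, c, cp, cn):
        self.cp[c], self.cn[c] = cp, cn
        self._place(c, cp, cn)

    def _place(self, c, lo, hi):
        s = self.seq
        i_lo, i_hi = s.index(lo), s.index(hi)
        between = s[i_lo + 1:i_hi]             # S' in the pseudocode
        if not between:
            s.insert(i_hi, c)
            return
        # Keep only the characters generated across the whole (lo, hi) gap:
        # C_P(d) is at or before lo and C_N(d) is at or after hi.
        kept = [d for d in between
                if s.index(self.cp[d]) <= i_lo and i_hi <= s.index(self.cn[d])]
        L = [lo] + kept + [hi]
        i = 1
        while i < len(L) - 1 and L[i] < c:     # L[i] <_id c
            i += 1
        self._place(c, L[i - 1], L[i])

    def integrate_del(self, c):
        self.hidden.add(c)

    def value(self):
        return "".join(str(c) for c in self.seq[1:-1] if c not in self.hidden)


def replay(ops):
    w = WString()
    for c, cp, cn in ops:
        w.integrate_ins(c, cp, cn)
    return w


# Four concurrent inserts: 1 and 2 between the markers, 3 before 1, 4 after 1.
o1, o2, o3, o4 = (1, CB, CE), (2, CB, CE), (3, CB, 1), (4, 1, CE)
site_a = replay([o2, o1, o3, o4])
site_b = replay([o1, o3, o4, o2])
print(site_a.value(), site_b.value())
```

Both integration orders should yield "3124": the filter keeps only the characters that are concurrent with c over the whole gap, and the identifier order then fixes the position independently of the arrival order.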

3.5 Examples 

Let site 1, site 2 and site 3 be three sites in the initial state "c_b c_e". We consider the
following scenario:



This scenario generates the following Hasse Diagram: 

Let us assume that '1' <_id '2' <_id '3' <_id '4'. Site 2 integrates o1, o3, o4 in this order:

1. IntegrateIns(1, c_b, c_e): S' = L = "2" and 1 <_id 2, so WOOT integrates 1 between c_b
and 2. During the recursive call IntegrateIns(1, c_b, 2), we get S' = ∅. Thus we compute
"c_b 1 2 c_e".

2. IntegrateIns(3, c_b, 1): S' = ∅, so WOOT inserts 3 between c_b and 1.

3. IntegrateIns(4, 1, c_e): S' = L = "2" and 2 <_id 4, so WOOT integrates 4 between 2 and
c_e. We obtain "c_b 3 1 2 4 c_e".

Site 3 integrates o2.

During IntegrateIns(2, c_b, c_e): S' = "314" and C_N(3) = C_P(4) = 1, so we get L = "1".
As 1 <_id 2, WOOT integrates 2 between 1 and c_e. During the recursive call
IntegrateIns(2, 1, c_e): S' = L = "4" and 2 <_id 4, so WOOT integrates 2 between 1 and 4.

Site 3 obtains "c_b 3 1 2 4 c_e".

On site 1, whatever the order of arrival of o2, o3 and o4, WOOT computes "c_b 3 1 2 4 c_e".
In every case, the final string is "3124" and PCI consistency is ensured.

However, it is possible to generate more complex scenarios for the WOOT integration
procedure. We can produce a scenario with 7 sites. Each site generates a character containing
its site identifier. We can generate the following Hasse diagram:



Suppose characters 0, 1, 2, 3, 4 have already been received on a site. First, we simulate the
reception of 6 then 5. Next, we simulate the reception of 5 then 6 and verify that we find
the same result.

The integration procedure produced the string "12034". 1 is linearized before 2 because
1 <_id 2, and 3 is linearized before 4 because 3 <_id 4. 2 is before 0 and 3 is after 0.

We now integrate ins(1 ≺ 6 ≺ 3) in "12034". S' = "20" and L = "0". Only 0 satisfies
C_P(0) ≤_S C_P(6) and C_N(6) ≤_S C_N(0).

As 0 <_id 6, we linearize 6 between 0 and 3. We obtain "120634".

Now, the same site receives ins(2 ≺ 5 ≺ 4) and integrates 5 in the string "120634".
S' = "063" and L = "0". As 0 <_id 5, we integrate 5 recursively between 0 and 4. S' = "63"
and L = "3". As 3 <_id 5, 5 is placed between 3 and 4. So the final result is "1206354".

We restart the scenario from the "12034" state, but now we receive ins(2 ≺ 5 ≺ 4) first.
S' = "03" and L = "0". As 0 <_id 5, we integrate 5 recursively between 0 and 4. S' = L = "3"
and 3 <_id 5, so we obtain "120354".

Now we integrate ins(1 ≺ 6 ≺ 3) in "120354". S' = "20" and L = "0". As 0 <_id 6, 6 is
placed between 0 and 3. We obtain "1206354" as expected.

4 Correctness and Complexity 

First, we check that WOOT ensures the PCI consistency model. Next we evaluate the time 
and space complexity of WOOT. 

4.1 Correctness 

Theorem 1 The algorithm of integration terminates. 
Proof 1 Proof by contradiction.

The algorithm does not terminate if and only if the recursive call is not made on a strictly
smaller subsequence. This can happen only if we get a non-empty S' and L = c_p c_n.
If L = c_p c_n, every character in S' has its predecessor or its successor in S'. Since
the characters have been generated in a strict order, at least the first integrated character
between c_p and c_n has its previous and its next character outside S'.

Thus, there are at least 3 characters in L. So the recursive call is made on a strictly
smaller subsequence and the algorithm terminates.

Precondition preservation Since an operation op is integrated only if isExecutable(op)
is true, precondition preservation is ensured.

Intention preservation Our linearization order must respect the precedence order
defined when operations are generated.

Theorem 2 The relation <_S induced by WOOT on each site is a linear extension of the
relation ≺.

Proof 2 The ≺ relation is never modified after the generation of an operation. The
linearization <_S is only modified through IntegrateIns. The integration of a character c is
always done by inserting c between C_P(c) and C_N(c). Thus <_S is a linear extension
of ≺.

However, ensuring intention preservation is not enough. For instance, if two sites insert
respectively 'a' and 'b' between the same characters 'x' and 'y', we can obtain two different
strings on the two sites, respectively "xaby" and "xbay". These two linear extensions respect
intention preservation but they do not converge.

Convergence We do not have a hand-written proof of convergence. To verify the
correctness of our algorithm, we used the TLC model checker on a specification written
in the TLA+ specification language. Model-checking techniques are particularly
suited to verifying concurrent systems. With the TLC model checker we verified a bounded
version of our framework. Due to the well-known state explosion problem, it is impossible in
practice to test a system with a large number of sites and characters. We made complete
verifications up to four sites and five characters. It took about two weeks on a Pentium(R)
4 CPU at 2.80GHz. The complete TLA+ specification is given in a dedicated section.

Convergence requires that any two characters are linearized in the same order on any two
sites. The following conjecture, together with the fact that every generated character is
eventually inserted on every site, ensures the convergence criterion.

Conjecture 1 Let S_1 and S_2 be two W-strings maintained by two different sites. For every
pair {c, d} of characters that appear on both sites, we have

c <_S1 d ⇔ c <_S2 d
To reduce the design complexity of the specification and to accelerate model checking, we
made two slight generalizations. First, the del operation and its integration do not appear in
the model, since this operation does not affect the linearization order. However, to simulate
the deletion of characters, we allow generating an ins(c_p ≺ c ≺ c_n) in S requiring only
c_p <_S c_n, i.e. as if the characters between c_p and c_n were deleted.

The second generalization consists in representing characters simply by an identifier. This
is also a generalization since, in the model, any site can generate any character identifier.

The TLC model checker found an error in a previous naive version of the integration
algorithm. In this version, we did not filter S' to obtain L. We thought that, since all the
characters between c_p and c_n were concurrent to c, we simply had to order them according
to <_id. The model checker found a counter-example, presented in the corresponding figure.
This counter-example helped us to design the current version of the integration algorithm.

4.2 Complexity 

We evaluate the algorithmic complexity of WOOT as a function of n, the number of
operations already generated by all sites. The following theorem expresses that the space
required by WOOT is proportional to n.

Theorem 3 WOOT space complexity is O(n).

Proof 3 The sizes of the local clock and of the site identifier are constant (O(1)).

The pool contains k operations, with k ≤ n.

The size of a W-character is constant (O(1)). There are m ≤ (n - k) already integrated insert
operations. The W-string begins with two characters (c_b and c_e). Delete operations do not
affect the size of the W-string and each insert operation adds a unique W-character. Thus,
the size of a W-string is proportional to m + 2.

Thus, the WOOT space complexity is proportional to k + m and, since k + m ≤ n, the
WOOT space complexity is O(n).

Theorem 4 The worst case time complexity of integration is O(n^3).

Proof 4 Let m be the size of string_s. The integration of a del(c) operation takes linear
time O(m): we have to scan the W-string to find the identifier of c.

For an ins(c) operation, we have the following statements and their worst-case time
complexity:

• S' = subseq(...) (O(m))
• either insert(...) (O(m))
• or
  - filter S' to obtain L: since ≤_S is in O(m), it takes O(m^2).
  - while loop to find i (O(m))
  - recursive call.

There are at most m recursive calls to IntegrateIns. Thus, the time complexity is
O(m(m + max(m, m^2 + m))) = O(m^3) = O(n^3) because m ≤ n.

We supposed that finding a W-character or comparing the positions of two W-characters in a
W-string is in O(n). If we maintain an index of identifiers in a W-string, we can improve the
WOOT average time complexity to O(n^2). It will take more space but the space complexity
will still be in O(n).

5 Related work 

In the OT approach, only a few algorithms ensure the intention consistency model as defined
by Li. Often, they fail on the well-known TP2 puzzle. The starting point of the TP2 puzzle is
very simple. Three sites share a string "abcd". Site 1 executes o1 = ins(3, 'x'), site 2 executes
o2 = del(2) and site 3 performs o3 = ins(2, 'y'). Everybody sees immediately that 'x' is after
'c', 'y' is before 'c' and 'b' is destroyed. The convergence state should therefore be "aycxd".
Why is it so complex to solve this problem with OT? The problem looks simple because we
can observe all sites simultaneously, whereas an OT algorithm runs on each site and must build
its state with only the knowledge of its log. When operation o1 is received on site 2, o1 does
not contain the information "x should be after c". The OT algorithm has to recompute this
information from its log. This leads OT algorithms to solve complex problems and propose
complex solutions. If we take the WOOT viewpoint on the TP2 puzzle, we obtain the following
Hasse diagram:





The partial order is already a total order. The only linear extension of this graph is
"abycxd". As 'b' is not visible, the final string is "aycxd" as expected.

SOCT4 maintains the same transformation path on all sites by using a continuous
global total order built with a centralized timestamper and deferred broadcast. Li wrote
that SOCT4 cannot ensure operation effects relation preservation and gives an example (p. 458).
We replayed this example and found that, in this case, the operation effects relation
is preserved. We claim that it has not been proven that SOCT4 cannot ensure the intention
consistency model. However, it is clear that the centralized timestamper is not compatible
with a peer-to-peer environment. Compared with SOCT4, WOOT does not require
a central site.

SOCT3 also maintains a continuous global total order using a centralized timestamper
but does not require deferred broadcast. SOCT3 uses a separation procedure for resolving
partial concurrency that requires backward transformations. Using backward transformations
without requiring TP2 on forward transformations is unsafe. So SOCT3 mandates
TP2, and transformation functions satisfying TP2 are not provided (see the corresponding
figure).

GOT [12] maintains a non-continuous global total order without using a centralized
timestamper. The non-continuous global order forces the GOT algorithm to undo/redo some
operations, which is not really adequate for the high-responsiveness requirement of
real-time editors. GOT uses exclusion transformations to solve partial concurrency. Unlike
SOCT3, GOT calls the exclusion transformation on two causally dependent operations. The
exclusion transformation is not always defined in this case, which can lead to inconsistencies.

Other OT algorithms rely on their transformation functions satisfying TP1 and TP2.
Currently, the existing transformation functions from Ressel, Sun [12] and IMOR violate TP2:
counter-examples have been published for all three (the one for IMOR on p. 465 of the paper
that reports it). The scenario for IMOR is illustrated in the corresponding figure.



If we take the WOOT viewpoint on this scenario, we obtain the following Hasse diagram: 





All characters are already totally ordered, so all sites will converge to "0acb23". 

There is no counter-example for g] . Figure g] presents a counter-example for Suleiman 
transformation functions. This scenario violates both TP2 and intention preservation. The
complete transformation functions from Suleiman are given in the cited work. In fact, we
generate a classical TP2 puzzle between sites 1, 2 and 3. Sites 4 and 5 are only there to fill
the before and after sets of operations defined by Suleiman.
The WOOT viewpoint on this counter-example is presented in the following Hasse
diagram:

On site 2, when ins(b ≺ x ≺ d) from site 1 is received, we integrate 'x' in the
current state "abcyd", in which 'c' is invisible. S' = "cy" and L = "c". We assume that
c <_id x. We recursively call the integration of x between c and d. S' = L = "y" and
x <_id y, so we obtain "abcxyd", with 'c' still invisible. The same integration scenario is
executed on site 3 and returns the same result.

Li defines transformation functions that satisfy TP1 and TP2, together with a new integration 
algorithm, SDT. When SDT transforms two concurrent insert operations o1 and o2 defined 
on the same state s, it computes β_LSP(o1) and β_LSP(o2), the positions of o1 and o2 
on a state called the Last Synchronization Point (LSP). The LSP is the state identified by the 
vector clock V_min = min(v(o1), v(o2)), where v(o1) and v(o2) are the vector clocks of o1 and 
o2; V_min is built by taking the minimal value of each component of these two vector clocks. 
First, SDT computes the sequence SQ of operations that generates state s starting from state 
LSP. Then, SDT computes another sequence SD, which is equivalent to SQ but contains only 
the net effect between LSP and s. Next, SDT excludes SD from o1 and o2 and obtains the 
positions of o1 and o2 on state LSP. Comparing β_LSP(o1) and β_LSP(o2) breaks the tie while 
ensuring intention consistency. 
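
As a small illustration of the vector clock that identifies the LSP, the component-wise minimum can be computed as below. This is a hedged sketch: the function name `v_min`, the list representation of vector clocks, and the sample values are assumptions made for illustration, not part of SDT as published.

```python
def v_min(v1, v2):
    """Component-wise minimum of two vector clocks of equal length."""
    if len(v1) != len(v2):
        raise ValueError("vector clocks must have one entry per site")
    return [min(a, b) for a, b in zip(v1, v2)]


# Hypothetical vector clocks of two concurrent operations on three sites.
clock_o1 = [3, 1, 4]
clock_o2 = [2, 2, 4]
lsp_clock = v_min(clock_o1, clock_o2)
print(lsp_clock)
```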

If we compare WOOT and SDT, it is clear that WOOT is much simpler. Unlike SDT, 
WOOT does not require vector clocks. At first sight, WOOT seems to require more space 
than SDT, but this is not the case. SDT requires each site to keep a log of executed 
operations, and each operation carries a vector clock. For n generated operations, SDT has a 
log of size n. If m is the number of sites, the size of a vector clock is O(m), so the space 
complexity of SDT is O(n * m). The space complexity of WOOT is O(n). 
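
To make the comparison concrete, the sketch below counts abstract storage entries for both approaches. It is only an illustration under stated assumptions: the function names and the sample figures (10,000 operations, 100 sites) are hypothetical, and the counts ignore constant factors such as the fixed number of fields stored per WOOT character.

```python
def sdt_clock_entries(n_ops, n_sites):
    """One vector clock of n_sites entries per logged operation."""
    return n_ops * n_sites


def woot_character_records(n_ops):
    """One record per generated character, independent of the number of sites."""
    return n_ops


n, m = 10_000, 100  # hypothetical workload
print(sdt_clock_entries(n, m), woot_character_records(n))
```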

Tendax [2] is a collaborative editor relying on a database. WOOT and Tendax have 
the same operations profile. Thanks to its centralized approach, Tendax does not require 
vector clocks either. In Tendax, each character is stored with a unique identifier, the 
identifier of the previous character and the identifier of the next character. However, Tendax 
removes characters, and its behavior is not formally described in this case. Unlike Tendax, 
WOOT does not require a central server. 

6 Conclusion 

Currently, WOOT is the only framework that preserves intention consistency on linear 
structures without central sites and vector clocks. Unlike OT algorithms, WOOT scales and 
is particularly adapted to large peer-to-peer networks. 

Although WOOT makes drastic choices in its data model, i.e. no deletion of elements and 
unique identifiers for characters, its space complexity is less than or equal to that of 
comparable OT frameworks. Although we do not yet have a complete proof, the WOOT framework 
has been formally verified with model checking tools. 

We have written a simple multithreaded implementation in Java that can simulate 
any scenario. It can be downloaded from http://www.loria.fr/~molli/woot 

We are now working on a complete proof of WOOT correctness, on group undo features, 
and on WOOT support for XML trees. 

A WOOT TLA specification 

---------------------------- MODULE WOOT ----------------------------
EXTENDS Naturals, FiniteSets, Sequences, TLC

CONSTANTS
  Sites,     \* set of sites
  MAXGEN,    \* maximum generated number
  CB, CE

VARIABLES
  string,    \* array of W-string
  pool,      \* array of pools of messages
  ccp, ccn,  \* original previous-next
  gens       \* generated chars

Chars == (1 .. MAXGEN) \cup {CB, CE}
Perms == Permutations(Sites)
less(c, d) == (c < d)
find(seq, c) ==
  CHOOSE i \in 1 .. Len(seq) : seq[i] = c


integrate(c, icp, icn, seq) ==
  LET int[cp, cn \in Chars] ==
        LET i1 == find(seq, cp)
            i2 == find(seq, cn)
            comp(d) == (find(seq, ccp[d]) <= i1) /\ (i2 <= find(seq, ccn[d]))
        IN
          IF (i2 - i1) = 1 THEN
            [i \in 1 .. (Len(seq) + 1) |-> IF (i < i2) THEN seq[i] ELSE
                                           IF (i = i2) THEN c ELSE seq[i - 1]]
          ELSE
            LET ic == CHOOSE i \in i1 .. i2 - 1 :
                        (i = i1 \/ (comp(seq[i]) /\ less(seq[i], c))) /\
                        \A j \in i + 1 .. i2 - 1 : comp(seq[j]) => less(c, seq[j])
                id == CHOOSE i \in ic + 1 .. i2 :
                        (i = i2 \/ (comp(seq[i]) /\ less(c, seq[i]))) /\
                        \A j \in i1 + 1 .. i - 1 : comp(seq[j]) => less(seq[j], c)
            IN
              int[seq[ic], seq[id]]
  IN
    int[icp, icn]
init ==
    /\ pool = [s \in Sites |-> {}]
    /\ ccp = [c \in 1 .. MAXGEN |-> <<>>]
    /\ ccn = [c \in 1 .. MAXGEN |-> <<>>]
    /\ string = [s \in Sites |-> <<CB, CE>>]
    /\ gens = {}

sendIns(s, c, cp, cn) ==
  LET n == Len(string[s])
  IN
    /\ \E i \in 1 .. n - 1 : (cp = string[s][i] /\
                              \E j \in i + 1 .. n : cn = string[s][j])
    /\ \A i \in 1 .. n : string[s][i] # c
    /\ c \notin gens
    /\ pool' = [i \in Sites |-> IF i = s THEN pool[i]
                                ELSE pool[i] \cup {<<"Ins", c, cp, cn>>}]
    /\ string' = [string EXCEPT ![s] = integrate(c, cp, cn, string[s])]
    /\ ccp' = [ccp EXCEPT ![c] = cp]
    /\ ccn' = [ccn EXCEPT ![c] = cn]
    /\ gens' = gens \cup {c}
    /\ UNCHANGED <<>>

getIns(s, c, cp, cn) ==
  LET msg == <<"Ins", c, cp, cn>>
      n == Len(string[s])
  IN
    /\ msg \in pool[s]
    /\ \E i \in 1 .. n : string[s][i] = cp
    /\ \E i \in 1 .. n : string[s][i] = cn
    /\ pool' = [pool EXCEPT ![s] = pool[s] \ {msg}]
    /\ string' = [string EXCEPT ![s] = integrate(c, cp, cn, string[s])]
    /\ UNCHANGED <<ccp, ccn, gens>>

    /\ Cardinality(gens) = MAXGEN
    /\ \A s \in Sites : pool[s] = {}
    /\ ccp' = [c \in 1 .. MAXGEN |-> <<>>]
    /\ ccn' = [c \in 1 .. MAXGEN |-> <<>>]
    /\ string' = [s \in Sites |-> <<CB, CE>>]
    /\ gens' = {}
    /\ UNCHANGED <<pool>>

wtNext ==
    \/ \E s \in Sites : \E c \in Chars : \E cp \in Chars : \E cn \in Chars :
         sendIns(s, c, cp, cn)
    \/ \E s \in Sites : \E c \in Chars : \E cp \in Chars : \E cn \in Chars :
         getIns(s, c, cp, cn)


Conv ==
  \A S1 \in Sites : \A S2 \in Sites \ {S1} :
    \A i1 \in 1 .. Len(string[S1]) : \A i2 \in 1 .. Len(string[S2]) :
      \A j1 \in 1 .. Len(string[S1]) : \A j2 \in 1 .. Len(string[S2]) :
        string[S1][i1] = string[S2][i2] /\ string[S1][j1] = string[S2][j2]
          => ((i1 < j1) = (i2 < j2))


vars == <<pool, string, gens, ccp, ccn>>
spec == init /\ [][wtNext]_vars

THEOREM spec => []Conv
=====================================================================
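
The Conv property states that any two characters present on two sites appear in the same relative order on both. As a rough illustration outside TLA, the same check can be written in Python. This is a sketch under assumptions: the function name `converges`, the dictionary from site to list of character identifiers, and the sample strings are hypothetical, and identifiers are assumed to be unique within each string.

```python
from itertools import combinations


def converges(strings):
    """strings maps each site to its list of unique character identifiers.

    Returns True when every pair of characters shared by two sites is in the
    same relative order on both sites.
    """
    for s1, s2 in combinations(strings, 2):
        pos1 = {ch: i for i, ch in enumerate(strings[s1])}
        pos2 = {ch: i for i, ch in enumerate(strings[s2])}
        common = [ch for ch in strings[s1] if ch in pos2]
        for a, b in combinations(common, 2):
            if (pos1[a] < pos1[b]) != (pos2[a] < pos2[b]):
                return False
    return True


# Site 2 has not yet received 'x', but the shared characters agree in order.
ok = converges({1: list("abxd"), 2: list("abd")})
# Sites 1 and 2 disagree on the order of 'x' and 'y'.
bad = converges({1: list("axyb"), 2: list("ayxb")})
print(ok, bad)
```

Like Conv, this check does not require the sites to hold the same set of characters; it only constrains the characters they have in common.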